Minimal training based semantic categorization in a voice activated question answering (VAQA) system
نویسندگان
چکیده
In this paper, we develop a knowledge based methodology that maps Automatic Speech Recognizer (ASR) transcriptions to predefined semantic categories in a Voice Activated Question Answering (VAQA) system. The proposed semantic categorization methodology, SemCat, uses a novel lexical chains/ontology based algorithm and relies heavily on customized but domain independent Natural Language Processing (NLP) tools and does not require any domain-specific utterance collections or manually annotated text data. SemCat requires minimal manual intervention during training, relying only on the semantics encoded in a brief, manually-created description for each predefined category/slot. SemCat uses these descriptions along with the eXtended WordNet Knowledge Base (XWN-KB) and several domain independent NLP tools including XWN lexical chains to accurately extract information and map user utterances to predefined categories. SemCat also uses the domain ontologies created automatically by the Jaguar knowledge acquisition tool to accurately extract domain/customer specific language/terms.
منابع مشابه
Language Modelization and Categorization for Voice-Activated QA
The interest of the incorporation of voice interfaces to the Question Answering systems has increased in recent years. In this work, we present an approach to the Automatic Speech Recognition component of a Voice-Activated Question Answering system, focusing our interest in building a language model able to include as many relevant words from the document repository as possible, but also repres...
متن کاملBoosting Passage Retrieval through Reuse in Question Answering
Question Answering (QA) is an emerging important field in Information Retrieval. In a QA system the archive of previous questions asked from the system makes a collection full of useful factual nuggets. This paper makes an initial attempt to investigate the reuse of facts contained in the archive of previous questions to help and gain performance in answering future related factoid questions. I...
متن کاملAutomatic Construction of Semantic Dictionary for Question Categorization
An automatic method for building a semantic dictionary from existing questions in a pattern-based question answering system is proposed for question categorization. This dictionary consists of two main parts: Semantic Domain Terms (SDT), which is a domain specific term list, and Semantic Labeled Terms (SLT), which contain common terms tagged with semantic labels. The semantic dictionary is buil...
متن کاملبرچسبزنی خودکار نقشهای معنایی در جملات فارسی به کمک درختهای وابستگی
Automatic identification of words with semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching correct semantic roles to them, may lead to improvement in many natural language processing tasks including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...
متن کاملQASR: question answering using semantic roles for speech interface
In this paper, we evaluate a semantic role labeling approach to the extraction of answers in the open domain question answering task. We show that this technique especially improves the system performance when answers are communicated to the user by voice. Semantic role labeling identifies predicates and semantic argument phrases in a sentence. With this information we are able to analyze and e...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2008